This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.
Reverse Engineer 2 Notebook
Objective: Analyze data from the Blair and Elrich campaigns for county executive from 2018 and 2022.
###Loading and cleaning the data:
First, we load the libraries and the data sets, then clean the data:
library(tidyverse)
library(janitor)
library(lubridate)
##Blair 2018 contribs:
blair_contribs_2018 <- read.csv("data2/blair_contribs_2018.csv")
# Clean the column names and parse the date in one pipeline so the result is saved
blair_contribs_2018_cleaned <- blair_contribs_2018 %>%
  clean_names() %>%
  mutate(contribution_date = mdy(contribution_date))
blair_contribs_2018_cleaned
###Blair 2022 contribs
blair_contribs_2022 <- read.csv("data2/blair_contribs_2022.csv")
# Save the parsed date back into the cleaned data frame
blair_contribs_2022_cleaned <- blair_contribs_2022 %>%
  clean_names() %>%
  mutate(contribution_date = mdy(contribution_date))
blair_contribs_2022_cleaned
###Blair 2022 expenses
blair_expenses_2022 <- read.csv("data2/blair_expenses_2022.csv")
# Parse expenditure_date in place rather than creating a mislabeled contribution_date column
blair_expenses_2022_cleaned <- blair_expenses_2022 %>%
  clean_names() %>%
  mutate(expenditure_date = mdy(expenditure_date))
blair_expenses_2022_cleaned
###Blair 2018 expenses
blair_expenses_2018 <- read.csv("data2/blair_expenses_2018.csv")
# Parse expenditure_date in place rather than creating a mislabeled contribution_date column
blair_expenses_2018_cleaned <- blair_expenses_2018 %>%
  clean_names() %>%
  mutate(expenditure_date = mdy(expenditure_date))
blair_expenses_2018_cleaned
##Elrich 2018 contribs:
elrich_contribs_2018 <- read.csv("data2/elrich_contribs_2018.csv")
# Save the parsed date back into the cleaned data frame
elrich_contribs_2018_cleaned <- elrich_contribs_2018 %>%
  clean_names() %>%
  mutate(contribution_date = mdy(contribution_date))
elrich_contribs_2018_cleaned
###Elrich Contribs 2022
elrich_contribs_2022 <- read.csv("data2/elrich_contribs_2022.csv")
# Save the parsed date back into the cleaned data frame
elrich_contribs_2022_cleaned <- elrich_contribs_2022 %>%
  clean_names() %>%
  mutate(contribution_date = mdy(contribution_date))
elrich_contribs_2022_cleaned
###Elrich Expenses 2018
elrich_expenses_2018 <- read.csv("data2/elrich_expenses_2018.csv")
# Parse expenditure_date in place rather than creating a mislabeled contribution_date column
elrich_expenses_2018_cleaned <- elrich_expenses_2018 %>%
  clean_names() %>%
  mutate(expenditure_date = mdy(expenditure_date))
elrich_expenses_2018_cleaned
###Elrich Expenses 2022
elrich_expenses_2022 <- read.csv("data2/elrich_expenses_2022.csv")
# Parse expenditure_date in place rather than creating a mislabeled contribution_date column
elrich_expenses_2022_cleaned <- elrich_expenses_2022 %>%
  clean_names() %>%
  mutate(expenditure_date = mdy(expenditure_date))
elrich_expenses_2022_cleaned
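The load-and-clean steps above all follow the same pattern, so they could be wrapped in one helper. This is only a sketch: `load_and_clean()` is a hypothetical function that is not part of the original notebook, and it assumes the date column in each file is stored as month/day/year text. The temporary file and its single row are made up for the demonstration.

```r
library(dplyr)
library(janitor)
library(lubridate)

# Hypothetical helper: read a CSV, standardize the column names,
# and parse one month/day/year date column (named after cleaning).
load_and_clean <- function(path, date_col) {
  read.csv(path) %>%
    clean_names() %>%
    mutate(across(all_of(date_col), mdy))
}

# Made-up one-row file standing in for a real data set
tmp <- tempfile(fileext = ".csv")
writeLines(c("Contributor Name,Contribution Date,Contribution Amount",
             "DOE JANE,01/15/2022,100"), tmp)
example_contribs <- load_and_clean(tmp, "contribution_date")
example_contribs
```

With a helper like this, each data set would load in one line, for example `load_and_clean("data2/blair_contribs_2018.csv", "contribution_date")` for contributions and `"expenditure_date"` as the second argument for expenses.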
##County primary election results:
dem_precincts_18 <- read.csv("data2/dem_precincts_2018.csv")
dem_precincts_22 <- read.csv("data2/dem_precincts_2022.csv")
dem_county_22 <- read.csv("data2/dem_county_2022.csv")
dem_county_18 <- read.csv("data2/dem_county_2018.csv")
###Question 1: How much money did David Blair and Marc Elrich put into their own campaigns in 2022 vs. 2018? Blair is a businessman and millionaire, so it makes sense that he’s funding his own campaign. But how much exactly is he putting into it, and how does it compare to the past election and to how much Elrich’s campaign is raising?
###2022:
blair_contribs_2022_cleaned %>%
filter(contributor_name == "BLAIR DAVID THOMAS") %>%
group_by(contributor_name) %>%
summarize(total_blair = sum(contribution_amount))
elrich_contribs_2022_cleaned %>%
group_by(contributor_name) %>%
summarize(total_elrich = sum(contribution_amount)) %>%
arrange(desc(total_elrich))
###2018:
blair_contribs_2018_cleaned %>%
filter(contributor_name == "BLAIR DAVID THOMAS") %>%
group_by(contributor_name) %>%
summarize(total_blair = sum(contribution_amount))
elrich_contribs_2018_cleaned %>%
group_by(contributor_name) %>%
summarize(total_elrich = sum(contribution_amount)) %>%
arrange(desc(total_elrich))
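To put the self-funding totals in context, one option is to express the candidate's own contributions as a share of everything the campaign raised. The sketch below uses made-up numbers in a hypothetical `toy_contribs` data frame; only the column names `contributor_name` and `contribution_amount` come from the notebook.

```r
library(dplyr)

# Made-up rows for illustration only; these are not real contributions
toy_contribs <- data.frame(
  contributor_name = c("BLAIR DAVID THOMAS", "BLAIR DAVID THOMAS", "DOE JANE", "ROE RICHARD"),
  contribution_amount = c(500, 300, 150, 50)
)

# Total raised, amount given by the candidate, and the candidate's share
self_funding <- toy_contribs %>%
  summarize(
    total_raised = sum(contribution_amount),
    self_funded = sum(contribution_amount[contributor_name == "BLAIR DAVID THOMAS"]),
    pct_self_funded = self_funded / total_raised * 100
  )
self_funding
```

The same pipeline could be pointed at `blair_contribs_2022_cleaned` or `blair_contribs_2018_cleaned` to compare the two cycles.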
###Question 2: Who were the top 5 contributors to Blair and Elrich in 2022? How about in 2018? What are their connections to the candidates?
###2022
top_blair_contribs_22 <- blair_contribs_2022_cleaned %>%
mutate(contributor_name = case_when(
contributor_name == "BLAIR DAVID THOMAS" ~ "Blair David Thomas",
TRUE ~ contributor_name
)) %>%
group_by(contributor_name) %>%
summarize(total = sum(contribution_amount)) %>%
arrange(desc(total)) %>%
head(5)
top_blair_contribs_22
top_elrich_contribs_22 <- elrich_contribs_2022_cleaned %>%
group_by(contributor_name) %>%
summarize(total = sum(contribution_amount)) %>%
arrange(desc(total)) %>%
head(6)
top_elrich_contribs_22
###2018
top_blair_contribs_18 <- blair_contribs_2018_cleaned %>%
mutate(contributor_name = case_when(
contributor_name == "BLAIR DAVID THOMAS" ~ "Blair David Thomas",
TRUE ~ contributor_name
)) %>%
group_by(contributor_name) %>%
summarize(total = sum(contribution_amount)) %>%
arrange(desc(total)) %>%
head(5)
top_blair_contribs_18
top_elrich_contribs_18 <- elrich_contribs_2018_cleaned %>%
group_by(contributor_name) %>%
summarize(total = sum(contribution_amount)) %>%
arrange(desc(total)) %>%
head(6)
top_elrich_contribs_18
###Question 3: David Blair got more early and election day votes in 2022, while Elrich got more mail-in votes. How did that compare to 2018? People want to know how Marc Elrich won both times, first by 77 votes in 2018 and then by 32 in 2022. Where did Elrich do well across the three categories: early voting, election day, and mail-in votes? Did this sway the result at all?
dem_county_18_cleaned <-dem_county_18 %>%
clean_names()
dem_county_22_cleaned <-dem_county_22 %>%
clean_names()
blair <- dem_county_18_cleaned %>%
filter(candidate_name == "David Blair")
elrich <- dem_county_18_cleaned %>%
filter(candidate_name == "Marc Elrich")
blair_elrich <- bind_rows(blair, elrich)
blair_elrich
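One way to compare the candidates by voting method is to reshape the results so each row is one candidate and one method, then compute each candidate's share within that method. This is a sketch with made-up vote counts; the `toy_results` data frame and its vote column names are assumptions and may not match the layout of the county results files.

```r
library(dplyr)
library(tidyr)

# Made-up vote counts; the vote column names are assumptions
toy_results <- data.frame(
  candidate_name = c("David Blair", "Marc Elrich"),
  early_votes = c(100, 80),
  election_day_votes = c(200, 180),
  mail_in_votes = c(50, 100)
)

# One row per candidate and method, with the share of that method's votes
votes_by_method <- toy_results %>%
  pivot_longer(cols = ends_with("_votes"), names_to = "method", values_to = "votes") %>%
  group_by(method) %>%
  mutate(pct_of_method = votes / sum(votes) * 100) %>%
  ungroup()
votes_by_method
```

Note that the shares here are only between the two candidates in the toy data, not among all candidates in the race.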
###Question 4: Which parts of the county voted for Elrich and which voted for Blair based on precinct-level voting? Are there differences in demographics of those areas? What about income?
library(sf)
Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
moco_prec_2022 <- st_read("data2/MontMD_2022/MontMD_2022.shp") %>%
st_zm(drop=TRUE)
Reading layer `MontMD_2022' from data source
`C:\Users\merca\Documents\GitHub\data_journalism_fall_2022\major_assignments\reverse_engineering\data2\MontMD_2022\MontMD_2022.shp'
using driver `ESRI Shapefile'
Simple feature collection with 261 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension: XYZM
Bounding box: xmin: 1162792 ymin: 461729.6 xmax: 1344207 ymax: 614681.3
z_range: zmin: 0 zmax: 0
m_range: mmin: 0 mmax: 0
Projected CRS: NAD83 / Maryland (ftUS)
glimpse(moco_prec_2022)
Rows: 261
Columns: 6
$ NAME <chr> "MONTGOMERY PRECINCT 07-032", "MONTGOMERY PRECINCT 09-015", "MONTGOMERY PRECINCT 13-056", "M…
$ NUMBER <chr> "07-032", "09-015", "13-056", "13-016", "01-006", "04-027", "08-008", "08-016", "10-001", "0…
$ JURSCODE <chr> "MONT", "MONT", "MONT", "MONT", "MONT", "MONT", "MONT", "MONT", "MONT", "MONT", "MONT", "MON…
$ VOTESPRE <chr> "007-032", "009-015", "013-056", "013-016", "001-006", "004-027", "008-008", "008-016", "010…
$ COUNCIL <chr> "1", "3", "6", "4", "7", "6", "7", "7", "1", "6", "7", "1", "1", "7", "4", "3", "3", "2", "6…
$ geometry <MULTIPOLYGON [US_survey_foot]> MULTIPOLYGON (((1289150 485..., MULTIPOLYGON (((1251625 546..., MU…
moco_boundaries <- st_read("data2/moco_boundary.gdb")
Reading layer `CNTY_BNDY' from data source
`C:\Users\merca\Documents\GitHub\data_journalism_fall_2022\major_assignments\reverse_engineering\data2\moco_boundary.gdb'
using driver `OpenFileGDB'
Simple feature collection with 1 feature and 3 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -77.52769 ymin: 38.93425 xmax: -76.88764 ymax: 39.35434
Geodetic CRS: WGS 84
glimpse(moco_boundaries)
Rows: 1
Columns: 4
$ NAME <chr> "MONTGOMERY COUNTY"
$ SHAPE_Length <dbl> 1.895903
$ SHAPE_Area <dbl> 0.1367927
$ SHAPE <MULTIPOLYGON [°]> MULTIPOLYGON (((-77.18523 3...
moco_boundaries %>%
ggplot() +
geom_sf() +
theme_minimal()
moco_prec_2022
Simple feature collection with 261 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 1162792 ymin: 461729.6 xmax: 1344207 ymax: 614681.3
Projected CRS: NAD83 / Maryland (ftUS)
First 10 features:
NAME NUMBER JURSCODE VOTESPRE COUNCIL geometry
1 MONTGOMERY PRECINCT 07-032 07-032 MONT 007-032 1 MULTIPOLYGON (((1289150 485...
2 MONTGOMERY PRECINCT 09-015 09-015 MONT 009-015 3 MULTIPOLYGON (((1251625 546...
3 MONTGOMERY PRECINCT 13-056 13-056 MONT 013-056 6 MULTIPOLYGON (((1301382 519...
4 MONTGOMERY PRECINCT 13-016 13-016 MONT 013-016 4 MULTIPOLYGON (((1298182 487...
5 MONTGOMERY PRECINCT 01-006 01-006 MONT 001-006 7 MULTIPOLYGON (((1268938 556...
6 MONTGOMERY PRECINCT 04-027 04-027 MONT 004-027 6 MULTIPOLYGON (((1280700 512...
7 MONTGOMERY PRECINCT 08-008 08-008 MONT 008-008 7 MULTIPOLYGON (((1282944 540...
8 MONTGOMERY PRECINCT 08-016 08-016 MONT 008-016 7 MULTIPOLYGON (((1295564 548...
9 MONTGOMERY PRECINCT 10-001 10-001 MONT 010-001 1 MULTIPOLYGON (((1247972 509...
10 MONTGOMERY PRECINCT 04-026 04-026 MONT 004-026 6 MULTIPOLYGON (((1285712 505...
head(moco_prec_2022)
Simple feature collection with 6 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 1246398 ymin: 481181.4 xmax: 1302601 ymax: 556657.3
Projected CRS: NAD83 / Maryland (ftUS)
NAME NUMBER JURSCODE VOTESPRE COUNCIL geometry
1 MONTGOMERY PRECINCT 07-032 07-032 MONT 007-032 1 MULTIPOLYGON (((1289150 485...
2 MONTGOMERY PRECINCT 09-015 09-015 MONT 009-015 3 MULTIPOLYGON (((1251625 546...
3 MONTGOMERY PRECINCT 13-056 13-056 MONT 013-056 6 MULTIPOLYGON (((1301382 519...
4 MONTGOMERY PRECINCT 13-016 13-016 MONT 013-016 4 MULTIPOLYGON (((1298182 487...
5 MONTGOMERY PRECINCT 01-006 01-006 MONT 001-006 7 MULTIPOLYGON (((1268938 556...
6 MONTGOMERY PRECINCT 04-027 04-027 MONT 004-027 6 MULTIPOLYGON (((1280700 512...
write_csv(moco_prec_2022, "data2/moco_prec_2022.csv")
moco_prec_2022 %>%
ggplot() +
geom_sf() +
theme_minimal()
moco_22_pre_results<-read.csv("data2/Moco_22_pre_results.csv")
head(moco_22_pre_results)
moco_22_pre_results_cleaned <-moco_22_pre_results %>%
clean_names()
moco_22_pre_results_cleaned
moco_22_executive<-moco_22_pre_results_cleaned %>%
filter(office_name == "County Executive")
moco_22_executive
moco_22_pre_results_filtered<-moco_22_pre_results_cleaned %>%
filter(candidate_name == "David T. Blair" | candidate_name == "Marc Elrich")
moco_22_pre_results_filtered
blair_results_2022_filter<-moco_22_pre_results_filtered %>%
filter(candidate_name == "David T. Blair")
blair_results_2022_filter
blair_results_2022_filter_total <-blair_results_2022_filter%>%
mutate(total_votes_blair = early_votes + election_night_votes + mail_in_ballot_1_votes + provisional_votes + mail_in_ballot_2_votes)
blair_results_2022_filter_total
blair_results_joined_2022<- inner_join(moco_prec_2022, blair_results_2022_filter_total, by=c("VOTESPRE"="election_district_precinct"))
###Blair Map####
ggplot() +
geom_sf(data=blair_results_joined_2022, aes(fill=total_votes_blair)) +
scale_fill_viridis_b(option="magma")+
theme_minimal()
####Elrich map####
moco_22_pre_results_filtered_elrich <-moco_22_pre_results_cleaned %>%
filter(candidate_name == "Marc Elrich")
moco_22_pre_results_filtered_elrich
elrich_22_filtered_total <-moco_22_pre_results_filtered_elrich%>%
mutate(total_votes_elrich = early_votes + election_night_votes + mail_in_ballot_1_votes + provisional_votes + mail_in_ballot_2_votes)
elrich_22_filtered_total
elrich_results_joined_2022_map<- inner_join(moco_prec_2022, elrich_22_filtered_total, by=c("VOTESPRE"="election_district_precinct"))
ggplot() +
geom_sf(data=elrich_results_joined_2022_map, aes(fill=total_votes_elrich)) +
scale_fill_viridis_b(option="magma")+
theme_minimal()
###vote_difference***
blair_results_2022_filter_total
elrich_22_filtered_total
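# Join on precinct so each Blair row is matched only with the Elrich row for the same precinct
# (joining on county would pair every Blair precinct with every Elrich precinct)
both_joined_2022 <- left_join(blair_results_2022_filter_total, elrich_22_filtered_total, by = "election_district_precinct")
both_joined_2022
both_joined_mutated <- both_joined_2022 %>%
  mutate(difference = total_votes_elrich - total_votes_blair)
both_joined_mutated
both_joined_map <- inner_join(moco_prec_2022, both_joined_mutated, by = c("VOTESPRE" = "election_district_precinct"))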
ggplot() +
geom_sf(data=both_joined_map, aes(fill=difference)) +
scale_fill_viridis_b(option="magma")+
theme_minimal()
###percentage_change***
moco_22_executive <- moco_22_pre_results_cleaned %>%
filter(office_name == "County Executive") %>%
mutate(candidate_total = early_votes + election_night_votes + mail_in_ballot_1_votes + mail_in_ballot_2_votes + provisional_votes)
moco_combining_candidates_22 <- moco_22_executive %>%
group_by(election_district_precinct) %>%
summarise(total_precinct_votes = sum(early_votes + election_night_votes + mail_in_ballot_1_votes + mail_in_ballot_2_votes + provisional_votes))
exec_combined_joined <- left_join(moco_22_executive, moco_combining_candidates_22)
Joining, by = "election_district_precinct"
exec_combined_joined
exec_joined_percentages <- exec_combined_joined %>%
mutate(candidate_percents = candidate_total/total_precinct_votes*100)
exec_joined_percentages
pivot_candidates <- exec_joined_percentages %>%
select(election_district_precinct, candidate_name, candidate_percents, total_precinct_votes) %>%
pivot_wider(names_from = candidate_name, values_from = candidate_percents, values_fill = 0) %>%
clean_names()
pivot_candidates_joined<- inner_join(moco_prec_2022, pivot_candidates, by=c("VOTESPRE"="election_district_precinct")) %>%
mutate(blair_elrich_diff = david_t_blair - marc_elrich)
#Creating precinct percentage maps - negative values are where Elrich beat Blair, so the darkest areas are Elrich precincts
ggplot() +
geom_sf(data=pivot_candidates_joined, aes(fill=blair_elrich_diff)) +
scale_fill_viridis_b(option="magma")+
theme_minimal()
Blair-Elrich Map diff:
First impressions of this map show me that Marc Elrich won in the racially diverse areas of the county. This includes Silver Spring, Takoma Park (his home), Rockville, parts of Bethesda, and, interestingly, some of the white-dominated suburbs in the eastern part of the county such as Ashton, Sandy Spring, Colesville, Cloverly, and others. Elrich also won the western peripheries of the county, such as Poolesville and Barnesville.
As for Blair, he won what I would characterize as “the rich” areas of the county, which include Chevy Chase and Potomac. Interestingly, Blair also won precincts in places like Rockville, Gaithersburg, Germantown, Olney, and the upper, sparsely populated portions of the county. Income-wise, these areas are not as uniformly high in median income as Chevy Chase and Potomac.
One interesting place where Blair won was in the eastern part of the county, among a swath of Elrich precincts. This is Kemp Mill. According to Census figures found at https://www.census.gov/quickfacts/kempmillcdpmaryland, the median household income is $140,000. But that doesn’t answer the question of why they voted for Blair over Elrich. Another reason could be its demographics. The area is very Jewish and, according to the Washington Examiner, https://www.washingtonexaminer.com/tightly-knit-kemp-mill, has a very large Orthodox Jewish community. It’s unclear if this had any impact on the race, since it’s unknown what Blair’s religion is. Elrich himself is Jewish, but in this case, the Orthodox community may not have voted for him? It’s hard to say. Other largely Jewish areas of the county, such as Chevy Chase, parts of Rockville, and Potomac, also voted for Blair.
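The visual impressions above could be backed up with a simple count of how many precincts each candidate led. This is a sketch using made-up percentages in a hypothetical `toy_precincts` data frame; the column names `david_t_blair` and `marc_elrich` mirror the pivoted table in the notebook. Because other candidates were also on the ballot, "leader" here only means which of these two had the higher share.

```r
library(dplyr)

# Made-up precinct percentages for illustration only
toy_precincts <- data.frame(
  election_district_precinct = c("001-001", "001-002", "001-003", "001-004"),
  david_t_blair = c(48, 35, 40, 52),
  marc_elrich = c(34, 45, 40, 30)
)

# Label each precinct by which of the two candidates had the higher share, then count
precinct_leaders <- toy_precincts %>%
  mutate(
    blair_elrich_diff = david_t_blair - marc_elrich,
    leader = case_when(
      blair_elrich_diff > 0 ~ "Blair",
      blair_elrich_diff < 0 ~ "Elrich",
      TRUE ~ "Tie"
    )
  ) %>%
  count(leader)
precinct_leaders
```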
pivot_candidates_joined<- inner_join(moco_prec_2022, pivot_candidates, by=c("VOTESPRE"="election_district_precinct")) %>%
mutate(elrich_blair_diff = david_t_blair - marc_elrich) %>%
filter(elrich_blair_diff >= 10)
pivot_candidates_joined
Simple feature collection with 66 features and 11 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 1214839 ymin: 463836.8 xmax: 1327770 ymax: 614681.3
Projected CRS: NAD83 / Maryland (ftUS)
First 10 features:
NAME NUMBER JURSCODE VOTESPRE COUNCIL total_precinct_votes david_t_blair marc_elrich
1 MONTGOMERY PRECINCT 07-032 07-032 MONT 007-032 1 380 54.47368 30.00000
2 MONTGOMERY PRECINCT 09-015 09-015 MONT 009-015 3 351 48.43305 32.47863
3 MONTGOMERY PRECINCT 01-006 01-006 MONT 001-006 7 288 48.61111 37.84722
4 MONTGOMERY PRECINCT 08-008 08-008 MONT 008-008 7 602 47.17608 35.54817
5 MONTGOMERY PRECINCT 10-001 10-001 MONT 010-001 1 264 50.00000 31.06061
6 MONTGOMERY PRECINCT 01-001 01-001 MONT 001-001 7 315 47.30159 33.65079
7 MONTGOMERY PRECINCT 04-030 04-030 MONT 004-030 3 704 47.15909 35.79545
8 MONTGOMERY PRECINCT 09-003 09-003 MONT 009-003 3 309 46.92557 33.00971
9 MONTGOMERY PRECINCT 06-002 06-002 MONT 006-002 1 548 50.18248 29.37956
10 MONTGOMERY PRECINCT 10-002 10-002 MONT 010-002 1 461 60.73753 25.59653
peter_james hans_riemer geometry elrich_blair_diff
1 1.0526316 14.47368 MULTIPOLYGON (((1289150 485... 24.47368
2 3.1339031 15.95442 MULTIPOLYGON (((1251625 546... 15.95442
3 1.3888889 12.15278 MULTIPOLYGON (((1268938 556... 10.76389
4 1.4950166 15.78073 MULTIPOLYGON (((1282944 540... 11.62791
5 0.7575758 18.18182 MULTIPOLYGON (((1247972 509... 18.93939
6 1.9047619 17.14286 MULTIPOLYGON (((1269018 581... 13.65079
7 1.7045455 15.34091 MULTIPOLYGON (((1274243 511... 11.36364
8 2.2653722 17.79935 MULTIPOLYGON (((1243437 540... 13.91586
9 0.3649635 20.07299 MULTIPOLYGON (((1241771 495... 20.80292
10 0.8676790 12.79826 MULTIPOLYGON (((1243245 479... 35.14100
ggplot() +
geom_sf(data=moco_prec_2022) +
geom_sf(data=pivot_candidates_joined, aes(fill=elrich_blair_diff)) +
scale_fill_viridis_b(option="magma")+
theme_minimal()
Map two:
This map shows the precincts where Blair’s share of the vote was at least 10 percentage points higher than Elrich’s. What I see here is that the majority of the county was competitive, regardless of where you were. Blair won by somewhat big margins in areas like Potomac, which doesn’t surprise me. Potomac has one of the highest median incomes in the county, and Blair himself is a millionaire businessman. The other areas he won by somewhat bigger margins are Kemp Mill in the eastern part of the county and upper Montgomery County near Mt. Airy.
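Rather than picking a single cutoff, the margins could be grouped into bands so it is easy to see how many precincts were close and how many were lopsided. This is a sketch with made-up margins; `toy_margins` and the band boundaries are assumptions chosen for illustration.

```r
library(dplyr)

# Made-up percentage-point margins (Blair minus Elrich)
toy_margins <- data.frame(blair_elrich_diff = c(35, 24, 12, 8, -3, -15, -22))

# Group the size of the margin into bands, regardless of who led
margin_bands <- toy_margins %>%
  mutate(band = cut(
    abs(blair_elrich_diff),
    breaks = c(0, 10, 20, Inf),
    labels = c("under 10", "10 to 20", "20 or more"),
    right = FALSE
  )) %>%
  count(band)
margin_bands
```

Setting `right = FALSE` makes each band include its lower bound, so a margin of exactly 10 points falls in "10 to 20".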
library(ggplot2)
###Question 5: Where did David Blair and Marc Elrich spend their money on campaign finance resources? Ie: Meta advertisements, yard signs, TV, consulting? This would be interesting to know since Blair made a very big concerted effort to build his profile during the campaign while Elrich relied on incumbency and mainly TV ads to help him.
blair_expenses_2022_cleaned %>%
group_by(expense_category) %>%
summarize(total_category = sum(amount)) %>%
arrange(desc(total_category))
#spent nearly $3.1 mil in 2022 ... top 3 are Media, Salaries and Direct Mail by Mail House (R)
#dive into media
blair_expenses_2022_cleaned %>%
filter(expense_category == "Media") %>%
group_by(expense_purpose) %>%
summarize(total_category = sum(amount)) %>%
arrange(desc(total_category))
elrich_expenses_2022_cleaned %>%
group_by(expense_category) %>%
summarize(total_category = sum(amount)) %>%
arrange(desc(total_category))
Elrich spent $528,393.29 on media, which is significantly less than Blair did.
elrich_expenses_2022_cleaned %>%
filter(expense_category == "Media") %>%
group_by(expense_purpose) %>%
summarize(total_category = sum(amount)) %>%
arrange(desc(total_category))
#Elrich spent 32K on consulting fees.
This is what I wrote for No. 5: “There are significant differences in spending between Elrich and Blair here, and they show how the capital on hand can really influence the means a candidate has to win or lose a race. Elrich in 2022 went all-in on TV spending. Blair spent a lot more in 2018 as well.”
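Because the two campaigns spent very different total amounts, comparing each category as a share of a candidate's own spending may say more about strategy than raw dollars. This is a sketch with made-up amounts; the `toy_expenses` data frame and its `candidate` column are assumptions (in the notebook, the two expense tables would first need to be stacked with a label for each candidate).

```r
library(dplyr)

# Made-up spending rows for illustration only
toy_expenses <- data.frame(
  candidate = c("Blair", "Blair", "Blair", "Elrich", "Elrich"),
  expense_category = c("Media", "Salaries", "Direct Mail", "Media", "Salaries"),
  amount = c(600, 300, 100, 150, 50)
)

# Total per category, then each category's share of that candidate's spending
category_shares <- toy_expenses %>%
  group_by(candidate, expense_category) %>%
  summarize(total_category = sum(amount), .groups = "drop_last") %>%
  mutate(pct_of_spending = total_category / sum(total_category) * 100) %>%
  ungroup() %>%
  arrange(candidate, desc(total_category))
category_shares
```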